
ANR: 957881
The forex market plays a key role in the daily management of multinational companies operating in different currencies. Its volume surpasses that of any other market, exceeding $5 tn per day. Its global centre is London, where around 40 per cent of all transactions are executed.
International trade and global investing both nourish it and rely heavily on it. Its functioning is critical to support imports and exports and, consequently, to incentivise the exchange of resources and the creation of additional demand for goods and services. Without its current liquidity, companies' potential would be limited and global economic growth would be damaged.
Investors also benefit from the foreign exchange market. Diversification often requires exchanging currency in order to buy and sell foreign assets and securities. Equally, some investors see currencies as an asset class in itself and trade them to generate returns. The latter is precisely what this assignment tries to achieve. In addition, a good prediction of FX rates is of great help to multinational companies exposed to exchange-rate risk, so that they can hedge effectively against currency volatility.
Interactions between dealers occur in a global over-the-counter (OTC) network that connects buyers and sellers. There is no single exchange; however, the marketplaces are highly interconnected. One result of this structure is that rates differ between banks or market makers, which leaves room for arbitrage opportunities, especially in low-volume currencies. In this assignment we assume that the rates obtained from the data sources are tradable prices, and that any real application of the results would depend on the platform we trade on and, consequently, on its prices.
In terms of currency distribution in the FX markets:
| Rank | Currency | Share |
| 1 | | 87.6% |
| 2 | | 31.4% |
| 3 | | 21.6% |
| 4 | | 12.8% |
| 5 | | 6.9% |
Confusion Matrix for JPY
| | Predicted: 1 (Buy) | Predicted: 0 (Sell or Hold) |
| Actual: 1 (Buy) | 501 | 394 |
| Actual: 0 (Sell or Hold) | 185 | 1344 |
Now we are ready to backtest the algorithm with real unseen data.
# Calculate equity.
contracts = 10000.0
commission = 0.0
df_trade = pd.DataFrame(X_test, columns=['Return nominal'])
df_trade['Label'] = y_test
df_trade['Pred'] = predictions
df_trade['Won'] = df_trade['Label'] == df_trade['Pred']
df_trade.drop(df_trade.index[len(df_trade)-1], inplace=True)
# Initial capital of 10k USD; we assume earnings are reinvested.
df_trade['Capital'] = 10000.0
df_trade['Pos Hit'] = ""
Capital = 10000
curr = ''
pos = 0
neg = 0
for i in range(0, len(df_trade['Pred'])-1):
    if df_trade['Pred'].iloc[i] == 1:
        # Buy signal: buy (or keep holding) JPY and earn the day's return.
        curr = 'JPY'
        df_trade.loc[i+1, 'Capital'] = df_trade.loc[i, 'Capital']*(1+df_trade.loc[i, 'Return nominal'])
        # Count of profit and loss trades.
        if np.sign(df_trade.loc[i, 'Return nominal']) > 0:
            pos = pos+1
        else:
            neg = neg+1
    else:
        # Sell or hold signal: sell any JPY and hold USD, so there is no return.
        # Carrying the capital forward here also covers a 0 (hold) prediction,
        # which would otherwise leave the next row at the initial 10000.
        curr = 'USD'
        df_trade.loc[i+1, 'Capital'] = df_trade.loc[i, 'Capital']
print('Positive trades: ', pos, 'Negative trades: ', neg)

# Profit function if selling contracts. Not used, as we analyse the position of a US investor.
def calc_profit(row):
    if row['Won']:
        return abs(row['Return nominal'])*contracts - commission
    else:
        return -abs(row['Return nominal'])*contracts - commission

df_trade['Pnl'] = df_trade.apply(lambda row: calc_profit(row), axis=1)
# If we could sell and buy 3 day contracts.
df_trade['Equity contracts'] = df_trade['Pnl'].cumsum()
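The compounding rule applied in the loop above can be stated compactly. The following is a minimal sketch rather than the notebook's own code: `simulate_capital` and the sample signals and returns are made up for illustration. Capital grows by the day's return when the signal is 1 (buy or hold the foreign currency) and is carried forward unchanged otherwise.

```python
def simulate_capital(signals, returns, initial=10000.0):
    """Hypothetical helper: compound capital only on days with a buy signal (1)."""
    capital = [initial]
    for signal, ret in zip(signals, returns):
        if signal == 1:
            # Invested in the foreign currency: earn the day's return.
            capital.append(capital[-1] * (1.0 + ret))
        else:
            # Holding USD: capital is carried forward unchanged.
            capital.append(capital[-1])
    return capital

# Illustrative inputs only, not data from the backtest.
signals = [1, -1, 1, 0]
returns = [0.01, -0.02, -0.005, 0.03]
path = simulate_capital(signals, returns)
print([round(c, 2) for c in path])
```

Days 2 and 4 leave the capital untouched because the signal is not a buy, so the losing return on day 2 and the winning return on day 4 are both skipped.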
%%javascript
IPython.OutputArea.prototype._should_scroll = function(lines) {
    return false;
}
#Avoid scrolling output(Above)
# We run the backtest and plot it as an animation:
import matplotlib.pyplot as plt
from matplotlib import animation
from IPython.display import HTML
from matplotlib import style
style.use('fivethirtyeight')
x = df_trade.index.tolist()
y = df_trade['Capital'].tolist()
fig, ax = plt.subplots(figsize=(13, 7))
line, = ax.plot([], [], 'b-')
ax.margins(0.05)
plt.xlabel('Trading day')
plt.ylabel('Capital')
plt.title('Backtesting')

def init():
    line.set_data(x, y)
    return line,

def animate(i):
    # Draw the capital curve from the start up to frame i.
    imin = 0 #min(max(0, i - win), x.size - win)
    xdata = x[imin:i+2]
    ydata = y[imin:i+2]
    line.set_data(xdata, ydata)
    ax.relim()
    ax.autoscale()
    return line,

plt.tight_layout()
anim = animation.FuncAnimation(fig, animate, frames=len(df_trade['Capital']), init_func=init, interval=50)
plt.close()
HTML(anim.to_html5_video())
Note: We now repeat the same process for the EUR. The final analysis of both currencies follows after it.
# Same analysis, against the EUR
df_technical_1 = quandl.get("FED/RXI_US_N_B_EU", end_date="2017-12-31")
# Compute the return from the rate column only; shift(-1) pairs each row with the next one.
rate_1 = df_technical_1.iloc[:, 0]
df_technical_1['Return Nominal'] = rate_1/rate_1.shift(-1)-1
ax1 = df_technical_1.plot(y='Return Nominal', figsize=(10,4), lw=1, ylim=(-0.04,0.04))
ax1.set_title("EUR Return")
ax1.set_xlabel("Date")
ax1.set_ylabel("Daily %")
plt.show()
df_technical_1['Dummy'] = df_technical_1['Return Nominal'].apply(lambda x: 1 if x>0.0 else 0)
hm_days = 3
for i in range(1, hm_days+1):
    df_technical_1['Feature_{}d'.format(i)] = (
        df_technical_1['Return Nominal']/((df_technical_1['Return Nominal']).shift(-i))-1)
# Fill the gaps of the EUR frame (not the JPY one).
df_technical_1.fillna(0, inplace=True)
# The target is built from the longest horizon (hm_days).
df_technical_1['EURUSD_target'] = list(map(buy_sell_hold, df_technical_1['Feature_{}d'.format(hm_days)]))
df_technical_1['EURUSD_target'].value_counts()
df_vals_1 = df_technical_1['Return Nominal']
df_vals_1 = df_vals_1.replace([np.inf, -np.inf], 0)
df_vals_1.fillna(0, inplace=True)
df_technical_1['Real Return Sign'] = df_technical_1['Return Nominal']
df_technical_1['Real Return Sign'] = df_technical_1['Real Return Sign'].apply(lambda x: 1 if x>0.0 else -1)
X1 = df_vals_1.values
y1 = df_technical_1['EURUSD_target'].values
y1 = np.nan_to_num(y1)
# Chronological split: first 80% for training, last 20% for testing.
train_size_1 = int(len(X1) * 0.80)
y_real_labels_1 = df_technical_1['Real Return Sign'][train_size_1:len(y1)]
X_train1, X_test1 = X1[0:train_size_1], X1[train_size_1:len(X1)]
y_train1, y_test1 = y1[0:train_size_1], y1[train_size_1:len(y1)]
clf1 = VotingClassifier([('lsvc', svm.LinearSVC()),
                         ('knn', neighbors.KNeighborsClassifier()),
                         ('rfor', RandomForestClassifier())])
clf1.fit(X_train1.reshape(-1, 1), y_train1)
confidence_1 = clf1.score(X_test1.reshape(-1, 1), y_test1)
predictions_1 = clf1.predict(X_test1.reshape(-1, 1))
display('Confusion matrix:', confusion_matrix(y_real_labels_1, predictions_1, labels=[1,-1]))
print('Accuracy 3d Target:', confidence_1)
print('Accuracy 1d Target: ', accuracy_score(y_real_labels_1, predictions_1))
print('Predicted class counts:', Counter(predictions_1))
print('Real class counts: {}'.format(Counter(y_real_labels_1)))
print('Recall: ', recall_score(y_real_labels_1, predictions_1, pos_label=1))
print('Precision: ', precision_score(y_real_labels_1, predictions_1, pos_label=1))
contracts = 10000.0
commission = 0.0
df_trade1 = pd.DataFrame(X_test1, columns=['Return nominal'])
df_trade1['Label'] = y_test1
df_trade1['Pred'] = predictions_1
df_trade1['Won'] = df_trade1['Label'] == df_trade1['Pred']
df_trade1.drop(df_trade1.index[len(df_trade1)-1], inplace=True)
# Initial capital of 10k USD; earnings are reinvested.
df_trade1['Capital'] = 10000.0
Capital = 10000
curr = ''
pos1 = 0
neg1 = 0
for i in range(0, len(df_trade1['Pred'])-1):
    if df_trade1['Pred'].iloc[i] == 1:
        # Buy signal: buy (or keep holding) EUR and earn the day's return.
        curr = 'EUR'
        df_trade1.loc[i+1, 'Capital'] = df_trade1.loc[i, 'Capital']*(1+df_trade1.loc[i, 'Return nominal'])
        if np.sign(df_trade1.loc[i, 'Return nominal']) > 0:
            pos1 = pos1+1
        else:
            neg1 = neg1+1
    else:
        # Sell or hold signal: hold USD, so the capital is carried forward unchanged.
        curr = 'USD'
        df_trade1.loc[i+1, 'Capital'] = df_trade1.loc[i, 'Capital']
print('Positive trades: ', pos1, 'Negative trades: ', neg1)
df_trade1['Pnl'] = df_trade1.apply(lambda row: calc_profit(row), axis=1)
df_trade1['Equity contracts'] = df_trade1['Pnl'].cumsum()
Confusion Matrix for EUR
| | Predicted: 1 (Buy) | Predicted: 0 (Sell or Hold) |
| Actual: 1 (Buy) | 63 | 425 |
| Actual: 0 (Sell or Hold) | 39 | 428 |
We see poor accuracy in daily trading; however, precision remains good enough (62%) to make sustainable returns. As we previously saw with the JPY, the 'recall' is lower. In the case of the EUR it is extremely low (13%), which means we are missing most buying opportunities.
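The precision and recall quoted above follow directly from the four counts in the EUR confusion matrix. As a quick check, here is a small sketch that recomputes them by hand; the variable names are ours and it does not use the notebook's scikit-learn calls.

```python
# Counts read from the EUR confusion matrix (rows: actual, columns: predicted).
tp, fn = 63, 425   # actual buy: predicted buy / predicted sell-or-hold
fp, tn = 39, 428   # actual sell-or-hold: predicted buy / predicted sell-or-hold

precision = tp / (tp + fp)                   # share of predicted buys that were right
recall = tp / (tp + fn)                      # share of real buy days that we caught
accuracy = (tp + tn) / (tp + fn + fp + tn)   # share of all days classified correctly

print("Precision: %.1f%%" % (100 * precision))   # 61.8%
print("Recall:    %.1f%%" % (100 * recall))      # 12.9%
print("Accuracy:  %.1f%%" % (100 * accuracy))    # 51.4%
```

Only 102 of the 955 test days trigger a buy, which is why recall is so low even though most of those 102 signals are correct.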
Let's see how it performs in the back-test!
# Running the backtest against the EUR
x1 = df_trade1.index.tolist()
y1 = df_trade1['Capital'].tolist()
fig, ax = plt.subplots(figsize=(13, 7))
line, = ax.plot([], [], 'b-')
ax.margins(0.05)
plt.xlabel('Trading day')
plt.ylabel('Capital')
plt.title('Backtesting')

def init():
    line.set_data(x1, y1)
    return line,

def animate(i):
    # Draw the capital curve from the start up to frame i.
    imin = 0 #min(max(0, i - win), x.size - win)
    xdata = x1[imin:i+2]
    ydata = y1[imin:i+2]
    line.set_data(xdata, ydata)
    ax.relim()
    ax.autoscale()
    return line,

plt.tight_layout()
anim = animation.FuncAnimation(fig, animate, frames=len(df_trade1['Capital']), init_func=init, interval=50)
plt.close()
HTML(anim.to_html5_video())
# Calculate summary of trades against JPY and EUR, printed side by side.
from math import ceil, floor

def float_round(num, places=0, direction=floor):
    # Round num to `places` decimals using floor, ceil or round.
    return direction(num * (10**places)) / float(10**places)

# Per-row change in capital; it is 0 on days without an open position.
df_trade['Difs'] = df_trade['Capital'].pct_change().fillna(0)
df_trade1['Difs'] = df_trade1['Capital'].pct_change().fillna(0)

def summary_rows(df, wins, losses):
    # Build the (label, formatted value) pairs for one backtest.
    capital, difs = df['Capital'], df['Difs']
    gains, drops = difs[difs > 0.0], difs[difs < 0.0]
    net = capital.iloc[-1] / capital.iloc[0] - 1
    return [
        ("Net Profit", str(float_round(net, 2, round))),
        ("Number Winning Trades", "%d" % wins),
        ("Number Losing Trades", "%d" % losses),
        ("Percent Profitable", "%.2f%%" % (100 * wins / (wins + losses))),
        # Simple (non-compounded) annualisation over 365 rows per year.
        ("Yearly return", str(float_round(net / len(capital) * 365, 2, round))),
        # Difs holds fractions, so scale by 100 before printing as a percentage.
        ("Avg Win Trade", "%.3f%%" % (100 * gains.mean())),
        ("Avg Loss Trade", "%.3f%%" % (100 * drops.mean())),
        ("Largest Win Trade", "%.3f%%" % (100 * gains.max())),
        ("Largest Loss Trade", "%.3f%%" % (100 * drops.min())),
        ("Profit Factor", "%.2f" % abs(gains.sum() / drops.sum())),
        ("Trading days", "%d" % len(capital)),
        ("Portfolio value ($)", "%.0f" % capital.iloc[-1]),
    ]

rule = '_' * 106
print('#################### JPY Summary ######################################## EUR Summary ####################')
print(rule)
for (label, jpy), (_, eur) in zip(summary_rows(df_trade, pos, neg), summary_rows(df_trade1, pos1, neg1)):
    # Left column is padded so the EUR values line up under the EUR header.
    print("{:<53}{}".format("%s : %s" % (label, jpy), "%s : %s" % (label, eur)))
    print(rule)
#Plot distributions trading algo vs Historical
fig = plt.figure(figsize=(13, 7))
plt.subplot(2, 2,1)
plt.hist(df_trade['Difs'],bins=35,color='#1286d6')
plt.xlim(-0.025,0.025)
plt.ylim(0,120)
plt.axvline(0, color='#34402d', linestyle='dashed', linewidth=1.5)
plt.title('Histogram: Predictor Returns on JPY')
plt.ylabel('Frequency')
plt.xlabel('Return in %')
plt.subplot(2, 2,2)
plt.hist(df_technical['Return Nominal'][train_size:-1],bins=50, color='#57bf22')
plt.xlim(-0.025,0.025)
plt.ylim(0,400)
plt.axvline(0, color='#34402d', linestyle='dashed', linewidth=1.5)
plt.title('Histogram: JPY Returns')
plt.ylabel('Frequency')
plt.xlabel('Return in %')
plt.subplot(2, 2,3)
plt.hist(df_trade1['Difs'],bins=20,color='#1286d6')
plt.xlim(-0.025,0.025)
plt.ylim(0,25)
plt.axvline(0, color='#34402d', linestyle='dashed', linewidth=1.5)
plt.title('Histogram: Predictor Returns on EUR')
plt.ylabel('Frequency')
plt.xlabel('Return in %')
plt.subplot(2, 2,4)
plt.hist(df_technical_1['Return Nominal'][train_size_1:-1],bins=50, color='#57bf22')
plt.xlim(-0.025,0.025)
plt.ylim(0,150)
plt.axvline(0, color='#34402d', linestyle='dashed', linewidth=1.5)
plt.title('Histogram: EUR Returns')
plt.ylabel('Frequency')
plt.xlabel('Return in %')
plt.tight_layout()
plt.show()
Refer to the summary and histograms above.
Our portfolio of 10k USD performs very well against the JPY: 72% of our operations are profitable (the 'precision' metric) and our return distribution is positively skewed compared to the historical returns. After 2423 trading days, our portfolio value is 132k USD.
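To put the JPY result in perspective, the growth from 10k to 132k USD over 2423 trading days can be converted into an average compounded rate. This is a rough sketch using the rounded figures quoted above; the 252 trading days per year is our assumption, not a figure from the notebook.

```python
initial_capital = 10000.0
final_capital = 132000.0      # rounded portfolio value quoted in the text
trading_days = 2423
days_per_year = 252           # assumption: typical number of trading days in a year

growth = final_capital / initial_capital
# Geometric (compounded) average growth per trading day and per year.
daily_rate = growth ** (1.0 / trading_days) - 1
annual_rate = growth ** (days_per_year / trading_days) - 1

print("Average daily growth:  %.4f%%" % (100 * daily_rate))
print("Average yearly growth: %.1f%%" % (100 * annual_rate))
```

The daily figure comes out at roughly 0.1%, which shows how small daily edges compound into a large final value when earnings are reinvested.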
On the other hand, the performance against the EUR is less successful. We make a profit 62% of the times we enter a trade. Fortunately, the returns are positively skewed, meaning that on average our profitable trades are larger than our losing trades, which eventually results in an overall positive return: a final portfolio value of 12k USD after 954 trading days. The metrics tell the same story. We saw a very low 'recall', so we expect to miss many positive trends. However, our precision is above the baseline, which means that when we enter a trade we are quite confident that it will be positive. That is why we train our algorithm with a 3-day horizon: when we enter a buy trade, the trend should be strong enough to deliver a positive return.
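The link between win rate, trade size and overall profit can be made explicit with the expected return per trade. In the sketch below the 62% win rate comes from the text, while the average win and average loss are hypothetical values chosen only to illustrate the formula.

```python
win_rate = 0.62      # share of profitable trades, from the text
avg_win = 0.006      # hypothetical average gain per winning trade (0.6%)
avg_loss = 0.004     # hypothetical average loss per losing trade (0.4%)

# Expected return per trade: gains weighted by the win rate minus
# losses weighted by the loss rate.
expectancy = win_rate * avg_win - (1 - win_rate) * avg_loss

# Win rate at which the strategy breaks even for these trade sizes.
breakeven_win_rate = avg_loss / (avg_win + avg_loss)

print("Expected return per trade: %.3f%%" % (100 * expectancy))
print("Break-even win rate: %.0f%%" % (100 * breakeven_win_rate))
```

When winning trades are larger than losing trades, the break-even win rate drops below 50%, so a 62% hit rate leaves a comfortable margin under these assumed sizes.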
In general terms the algorithm performs well, especially considering the historically negative skewness of returns in both currencies.
To finalize our analysis, we can visualize our performance in both currencies:
# Plot the backtest with our predictions and an initial capital of 10k USD
fig = plt.figure(figsize=(13, 14))
plt.subplot(3, 1, 1)
plt.plot(df_trade['Capital'], lw=1.5, zorder=15, color='#0a2bb1')
plt.title('Backtest with $10000 initial capital against JPY')
plt.xlabel('Trades')
plt.ylabel('Capital (USD)')
plt.xlim(0, len(df_trade['Capital']))
# Plot green if pos, red if neg
for r in range(0, len(df_trade['Difs'])):
    if df_trade['Difs'][r] > 0:
        plt.axvline(x=r, linewidth=0.5, alpha=0.7, color='g')
    elif df_trade['Difs'][r] < 0:
        plt.axvline(x=r, linewidth=0.5, alpha=0.7, color='r')
plt.subplot(3, 1, 2)
plt.plot(df_trade['Capital'][:len(df_trade1['Capital'])], lw=1.5, zorder=15, color='#0a2bb1')
plt.title('Backtest with $10000 initial capital against JPY \n Same trading days as EUR')
plt.xlabel('Trades')
plt.ylabel('Capital (USD)')
plt.xlim(0, len(df_trade['Capital'][:len(df_trade1['Capital'])]))
for r in range(0, len(df_trade['Difs'])):
    if df_trade['Difs'][r] > 0:
        plt.axvline(x=r, linewidth=0.5, alpha=0.7, color='g')
    elif df_trade['Difs'][r] < 0:
        plt.axvline(x=r, linewidth=0.5, alpha=0.7, color='r')
plt.subplot(3, 1, 3)
plt.plot(df_trade1['Capital'], lw=1.5)
plt.title('Backtest with $10000 initial capital against EUR')
plt.xlabel('Trades')
plt.ylabel('Capital (USD)')
plt.xlim(0, len(df_trade1['Capital']))
for r in range(0, len(df_trade1['Difs'])):
    if df_trade1['Difs'][r] > 0:
        plt.axvline(x=r, linewidth=0.5, alpha=0.7, color='g')
    elif df_trade1['Difs'][r] < 0:
        plt.axvline(x=r, linewidth=0.5, alpha=0.7, color='r')
plt.tight_layout()
plt.show()
In the graphs above we can observe how few trades we execute (see the green and red lines) in the EUR/USD pair compared to JPY/USD. This is due to our low 'recall' with the EUR: the algorithm stays conservative and enters the market only when it is confident of a positive return.
After this exploratory analysis of what machine learning can offer in terms of pattern recognition, we can look at potential improvements to make our predictions more robust. Cross-validation techniques for time series, which would allow better tuning and more solid performance on unseen data, would be a major improvement. Furthermore, including more informative features such as volumes, averages or fundamental shocks may also improve performance.
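As an illustration of what cross-validation for time series could look like, the sketch below generates walk-forward splits with an expanding training window, so that each test block always comes after the data used for training. `walk_forward_splits` is a hypothetical helper written for this example; scikit-learn offers a comparable splitter, `TimeSeriesSplit`, in `sklearn.model_selection`.

```python
def walk_forward_splits(n_samples, n_splits, test_size):
    """Hypothetical helper: yield (train_indices, test_indices) pairs in time order."""
    first_test_start = n_samples - n_splits * test_size
    if first_test_start <= 0:
        raise ValueError("not enough samples for the requested splits")
    for k in range(n_splits):
        test_start = first_test_start + k * test_size
        # Train on everything before the test block, never on later data.
        train_idx = list(range(0, test_start))
        test_idx = list(range(test_start, test_start + test_size))
        yield train_idx, test_idx

# Toy example with 10 observations, 3 folds and 2 test observations per fold.
splits = list(walk_forward_splits(n_samples=10, n_splits=3, test_size=2))
for train_idx, test_idx in splits:
    print("train:", train_idx, "test:", test_idx)
```

Unlike a shuffled k-fold, no fold trains on observations that come after its test block, which avoids leaking future information into the model.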
Thank you for reading!